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1  Introduction 


Useful  visual  information  can  be  extracted  from  optical  flow  —  a  2D  vector  field 
which  estimates  the  velocity  of  points  in  space  as  projected  on  the  image  plane.  For 
example,  images  can  be  segmented  at  discontinuities  of  optical  flow,  while  3 D  motion 
and  structure  parameters  can  be  computed  from  flow  features  in  several  ways  [Wax84, 
Kan85,  DN82b,  DN82a,  LHP80,  TH84,  WU85,  BH83,  WN86,  U1183].  A  thorough 
analysis  of  how  to  recover  optical  flow  robustly  from  a  sequence  of  time- varying  images, 
therefore,  is  important  for  understanding  the  strengths  and  weaknesses  of  different 
methods  that  can  be  used. 

In  this  paper,  a  number  of  algorithms  for  optical  flow  are  examined  on  both  theo¬ 
retical  and  experimental  grounds.  In  particular,  we  examine  differential  and  matching 
methods.  Emphasis  has  been  given  to  algorithms  which  have  a  very  simple  and  lo¬ 
cal  computational  structure.  We  contrast  algorithms  where  optical  flow  is  derived 
from  simple  local  processing  with  those  having  global  constraints.  The  former  meth¬ 
ods  may  require  some  simple  smoothing  to  achieve  coherence  while  the  latter  require 
global  smoothing  under  constraint  (regularization)[BPT88,  YG88]. 

The  flow  can  be  derived  either  by  differential  or  matching  methods.  Differen¬ 
tial  methods  estimate  image  velocity  from  spatial  and  temporal  variation  in  image 
brightness.  Matching  methods  search  for  displacements  which  bring  image  brightness 
features  into  correspondence. 

Our  analysis  suggests  several  points.  The  traditional  algorithms  for  optical  flow 
utilize  weak  assumptions  on  the  local  variation  of  the  flow,  and  on  the  variation  of 
image  brightness.  Strengthening  these  assumptions  makes  local  flow  computation 
possible.  The  computational  consequence  of  stronger  assumptions  is  a  need  for  larger 
spatial  and  temporal  support.  Using  larger  support  is  valid  when  constraints  on 
the  local  shape  of  the  flow  are  satisfied.  We  show  that  a  simple  constraint  on  the 
local  shape  of  the  optical  flow,  slow  spatial  variation  across  the  image  plane,  is  often 
satisfied. 

The  initial  measurements  in  differential  methods  include  various  spatial  and  tem¬ 
poral  derivatives  of  the  image.  Much  care  must  then  be  taken  in  numerical  imple¬ 
mentation  of  differentiation.  For  example,  if  the  scale  of  the  smoothing  filter  applied 
before  differentiation  is  comparable  with  the  size  of  the  inter-pixel  distance  ( <7  is  ap¬ 
proximately  1)  —  which  is  usually  the  case  —  accurate  numerical  approximation  of 
derivatives  requires  large  spatial  support.  Intuitively,  this  happens  because  the  inter¬ 
pixel  distance  is  not  “small”  with  respect  to  the  spatial  and  temporal  change  in  image 
brightness.  Moreover,  much  care  must  be  taken  in  mixing  numerical  approximations 
with  different  support.  This  fact  plays  a  crucial  role  in  analyzing  performance  of 
algorithms. 

Rather  general  conclusions  can  be  drawn  from  our  analysis.  Firstly,  local  algo- 
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rithms  provide  good  measurements  of  optical  flow  and  thus  are  particularly  promising 
as  inputs  to  later  visual  tasks.  Moreover,  suitable  differential  techniques  are  able  to 
produce  local  estimates  of  the  optical  flow  at  much  less  computational  expense  than 
global  methods. 

Secondly,  the  use  of  simple  constraints  on  the  local  shape  of  optical  flow  improves 
the  quality  of  results.  Localization  and  segmentation  require  precise  estimates  of  the 
optical  flow,  while  robustness  demands  an  assumption  of  smoothness  or  coherence  in 
the  flow.  These  goals  often  conflict.  The  better  the  estimates  in  the  initial  stage,  the 
less  stringent  need  be  the  constraints  aoplied  in  the  smoothing  stage. 

The  rest  of  the  paper  is  organized  in  three  sections.  Section  2  discusses  the  general 
structure  of  methods  for  optical  flow.  In  section  3  different  algorithms  are  considered. 
The  computational  assumptions  which  underlie  each  of  them  are  discussed  and  some 
novel  theoretical  arguments  introduced.  Experimental  results  are  always  presented 
to  corroborate  major  points.  Section  4  examines  the  performance  of  some  of  the 
algorithms  when  their  assumptions  do  not  hold.  The  concluding  section  summarizes 
our  results. 


2  General  Structure  of  Methods  for  Optical  Flow 

This  section  discusses  the  general  structure  of  several  rather  different  algorithms  which 
attempt  to  recover  optical  flow  from  a  sequence  of  time- varying  images. 

The  extraction  of  motion  information  from  a  sequence  of  images  is  a  difficult  task. 
Goals  and  computational  constraints  often  do  not  match,  and  sometimes  are  conflict¬ 
ing.  For  example,  local  optical  flow  estimates  can  be  refined  by  assuming  different 
degrees  of  spatial  smoothness.  On  the  other  hand,  discontinuity  of  optical  flow,  local¬ 
ization  of  features,  and  3 D  parameter  estimates  require  accurate  reconstruction  of  the 
optical  flow.  It  proves  useful  to  think  of  optical  flow  recovery  as  a  process  consisting 
of  two  steps.  In  the  first  step,  measurement ,  the  optical  flow  is  estimated  by  means 
of  local  techniques.  The  measurements  are  not  necessarily  very  accurate  but  should 
allow  for  a  complete  reconstruction  of  the  optical  flow,  which  is  carried  out  in  the 
second  step,  usually  a  smoothing  or  regularization  step. 

Indeed,  traditional  motion  studies  usually  start  from  the  assumption  that  the  first 
step  has  to  be  incomplete  and  ineffective  since  the  recovery  of  the  optical  flow  is  an 
underconstrained  problem  —  the  so-called  “aperture  problem'’  [MU81].  Recently,  how¬ 
ever,  this  assumption  has  been  successfully  challenged  [UGVT88a,  LBP88.  RSE8S]; 
several  studies  have  proved  experimentally  and  theoretically  that  there  is  enough  in¬ 
formation  in  a  sequence  of  images  to  give  an  accurate  measurement  of  local  motion 
in  a  single  step.  A  very  important  consequence  of  this  result  is  the  fact  that  often  no 
further  processing  is  required,  particularly  so  when  the  measurement  stage  depends 
upon  large  spatial  support,  as  in  matching  methods.  In  the  next  section,  we  will 
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examine  iu  some  detail  the  assumptions  underlying  these  studies,  and  the  structure 
and  computational  requirements  of  the  algorithms  that  arise  naturally  from  them. 
An  influential  algorithm  proposed  by  Horn  and  Schunck[HS81]  is  also  considered  as  a 
prototype  of  a  certain  way  to  look  at  the  problem  of  computing  the  optical  flow. 


3  Algorithms  for  Optical  Flow 

In  this  section  we  review  some  algorithms  for  optical  flow.  Emphasis  is  given  to 
computational  issues  and  theoretical  arguments  which  support  the  use  of  simple  but 
effective  constraints  to  produce  measurements  of  optical  flow  on  a  strictly  local  basis. 


3.1  Differential  Algorithms 

Optical  flow  information  can  be  extracted  by  making  assumptions  about  spatial  and 
temporal  variations  of  the  image  brightness.  If  E  =  E( xx,  x2,  t)  is  the  image  brightness 
at  the  location  x  =  (xx,x2)  on  the  image  plane  at  time  t  in  a  suitable  system  of 
orthogonal  coordinates,  the  weakest  possible  assumption  is  that  the  total  temporal 
derivative  of  the  image  brightness  vanishes  [FT79.  HS81]: 


(1) 


Eq.  1  embodies  the  “brightness  constancy”  assumption.  Since  this  is  a  very  simple 
assumption,  making  no  hypothesis  on  the  spatial  variation  of  the  optical  flow,  it  is 
the  starting  point  of  several  methods  for  computing  optical  flow.  Equation  1  can  be 
read  as  an  analytical  formulation  of  the  “aperture  problem”  —  because  from  Eq.  1  it 
is  only  possible  to  recover  the  component  of  the  optical  flow  in  the  direction  of  the 
spatial  gradient.  Equation  1  can  be  expanded  into  the  total  derivative  of  the  optical 
flow  v  =  (vi,  v2): 
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Note  that  Eq.  2  involves  only  first  spatial  and  temporal  derivatives  of  the  image 
brightness.  Methods  for  optical  flow  based  upon  Eq.  2  need  further  constraints  in 
order  to  determine  the  correct  flow  field.  Horn  and  Schunck  [HS81]  introduced  a 
smoothness  assumption  on  the  spatial  variation  of  the  optical  flow,  and  chose  the 
smoothest  vector  field  which  satisfies  Eq.  2  in  order  to  recover  a  unique  solution. 
They  also  proposed  an  iterative  algorithm  to  compute  the  solution,  which  we  have 
implemented  and  tested.  The  update  rule  in  the  iterative  scheme  is 
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where  superscripts  denote  the  step  and  t>i,u2  are  local  averages  of  ui,u2.'  Figure  1 
shows  a  “plaid”  pattern,  a  superposition  of  vertical  and  horizontal  sine  waves  of  the 
same  period  (64  pixels)  in  an  image  of  256  x  256  pixels.  The  pattern  is  translating 
across  the  image  plane  from  the  upper  left  corner  to  the  lower  right  corner  by  6.4 
pixels  per  frame.  The  Horn  and  Schunck  algorithm  recovers  the  correct  translational 
flow  for  this  pattern  (Fig.  2),  but  requires  many  iterations  of  the  iterative  regular¬ 
ization  (smoothing)  step.  Consider  the  effect  of  setting  A  to  zero  in  Eq.  3;  then  the 
regularization  (as  constructed)  will  choose  the  smoothest  velocity  field  derived  from 
repeated  averaging  of  the  initial  measurements.  The  number  of  iterations  of  such  a 
process  is  proportional  to  the  square  of  the  spatial  scale,  since  it  is  a  diffusion  pro¬ 
cess.  The  spatial  wavelength  is  large  in  this  case  (64  pixels),  necessitating  so  many 
iterations  to  propagate  information  from  regions  where  the  image  gradient  is  parallel 
to  the  flow  to  regions  where  the  image  gradient  is  normal  to  the  flow  (see  Fig.  3  for 
the  initial  normal  flow).  When  A  is  not  zero,  the  convergence  is  even  slower,  since  the 
subtracted  terms  in  Eq.  3  force  the  value  of  v  away  from  the  average  value  ( th ,  v2 ) 
toward  the  image  gradient.  Multigrid  methods  [Ter86]  can  be  used  to  overcome  some 
of  these  problems  of  scale. 

To  improve  the  measurement  step  without  enforcing  global  smoothness  constraints, 
a  simple  working  assumption  (examined  later  in  Section  4)  is  that  the  optical  flow 
does  not  change  appreciably  at  neighboring  pixels.  Then,  in  the  neighborhood  of 
every  pixel,  Eq.  2  can  be  rewritten  for  the  same  unknown  velocity.  At  each  pixel,  a 
linear  system,  possibly  overconstrained,  has  to  be  solved  (see  [LK81,  KTB87],  among 
others).  Every  equation  of  the  linear  system  for  the  optical  flow  v  =  (uj,t;2)  is  of 
the  form  of  Eq.  2,  and  v\  and  v2  are  assumed  to  be  constant  over  a  neighborhood  of 
every  pixel.  This  local  constraint  algorithm,  applied  to  the  sequence  of  Fig.  1,  using 
a  neighborhood  of  11x11  pixels,  produces  the  correct  translational  flow.  Note  that 
no  further  smoothing  step  has  been  implemented  but  that  a  larger  support  has  been 
used.  At  least  two  questions  arise  from  the  implementation  of  this  simple  algorithm: 
firstly,  what  can  be  said  about  the  assumption  on  the  local  structure  of  the  optical 
flow,  namely  that  the  optical  flow  is  locally  “approximately  constant”?  An  argument 
to  support  this  and  other  similar  constraints  is  presented  in  Section  4.  Secondly,  does 
this  formulation  lead  to  a  seriously  ill-conditioned  algorithm?  The  performance  of  the 
algorithm  is  dependent  on  the  local  variation  of  the  image  gradient;  the  neighborhood 
must  be  large  enough  to  contain  sufficient  variation  of  V£. 

Other  constraints,  instead  of  Eq.  2,  can  be  applied  for  optical  flow.  The  total 
temporal  derivative  of  several  other  quantities  vanishes  depending  on  the  kind  of 


'This  formula  (due  to  Horn)  corrects  an  error  in  the  update  rule  given  in  [Hor85],  p.  288. 
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Figure  1:  A  “plaid”  pattern,  a  superposition  of  vertical  and  horizontal  sine  waves  of 
the  same  period  (64  pixels).  The  pattern  is  translating  across  the  image  plane  from 
the  upper  left  corner  to  the  lower  right  corner  by  6.4  pixels  per  frame. 
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Figure  3:  Normal  flow  obtained  by  solving  Eq.  2  at  the  third  of  five  frames.  Im¬ 
age  brightness  derivatives  have  been  computed  by  using  five-point  approximation  of 
derivatives  derived  by  means  of  Taylor’s  expansion. 


6 


observed  motion  [VGT90].  For  example,  the  total  temporal  derivative  of  the  spatia 
gradient  vanishes  for  parallel  translation  on  the  image  plane  [UGVTSSb],  that  is. 

dVE 
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We  term  this  assumption  gradient  constancy.  Expanding  Eq.  4  leads  to  the  two 
constraints: 

d2E  d2E  d2E 


dx-l2Vl  dx\dx2  2  +  ' 


d2E 


02E 


dx2dx\  1  dx22  2  dx2dt 
at  each  pixel  in  the  image.  If  the  spatial  changes  in  the  optical  flow  can  be  assumed  to 
be  negligible  (see  Sec.  4)  then  Eq.  4  can  be  solved  for  the  optical  flow.  However,  since 
Eq.  4  provides  two  constraints  for  the  optical  flow,  in  principle  the  flow  it  produces 
needs  less  smoothing.  Again  this  has  been  obtained  by  using  larger  support  since 
it  requires  computing  second  derivatives  [Nag83a,  Nag83b,  HL83,  TP82,  N’agST].  At 
locations  where  the  two  scalar  equations  in  Eq.  4  are  linearly  dependent  —  that 
is,  where  the  determinant  of  the  Hessian,  the  matrix  of  second  derivatives  of  image 
brightness,  vanishes  —  there  is  not  enough  local  information  to  measure  the  optical 
flow.  Where  the  determinant  of  the  Hessian  is  small,  the  linear  system  is  often  ill- 
conditioned.  For  an  analysis  of  errors  in  local  constraint  methods,  see  [KTB87J. 
The  gradient  constancy  algorithm  performs  well  on  the  sequence  in  Fig.  1;  where  the 
Hessian  vanishes,  there  are  gaps  -  this  implies  the  need  of  a  small  amount  of  smoothing 
to  fill  in  gaps.  Interestingly,  the  local  constraint  method  and  gradient  constancy  are 
intimately  related.  It  is  possible  to  show  that  the  constraints  given  by  locally  solving 
the  brightness  constancy  equation  at  three  nearby  points,  assuming  constant  flow, 
imply  the  gradient  constancy  constraint. 

Consider  three  points  in  the  image,  a  central  point  pi  and  two  others,  p2.  displaced 
by  (Axx,0)  from  pi,  and  P3,  displaced  by  (0,  AX2)  from  p\.  The  spatial  and  temporal 
derivatives  of  brightness  at  pi  can  be  expanded  in  a  Taylor  series  around  p\ ,  letting 
superscripts  denote  the  point  at  which  derivatives  are  taken,  to  give 


di\dt 

d2E 


=  0 


=  0 


0) 


dE2 
di\ 
dE 2 

dx2 

d£2 

dt 


dE  1  d2E  1 
dx\  di\2 
dE 1  d2E  1 
dx2  dx2dx  1  1 


dE'  d2E 


+ 


Axx 


dt  dtdii 

ignoring  high  order  terms.  Substituting  these  terms  in  Eq.  2  at  p2  yields 

d2E  1  .  .  ,dE'  d2E  1 


.  dE 1  d2E\  x  dE 1 

(  ~ — T  ^^1  )^1  +  (  ^ +  ..  .-s 

OX  1  OX\  OX  2  CJX20X 


\  .  (dEl 

-  +  +dtdx 1 


(6) 


Axi)  =  0.  (7) 


t 


Rearranging  gives 


rdEx  dE{  dE\  (d2E  d2E  1  d2E 

{dTx  +d72  +  Hi  +{d^v'  +  v 2  + 


)Ax]  =  0. 


<■$) 


dtdxy 

The  first  sum  vanishes,  and.  after  division  by  Axi.  we  get  the  first  equation  in  Eq.  5. 
Similar  expansion  at  p3  using  second  derivatives  in  x2,  with  the  same  manipulations, 
yields  the  second  equation  in  Eq.  5.  Local  solution  of  the  brightness  constraint  equa¬ 
tion  at  three  points,  chosen  in  this  way,  thus  implies  the  gradient  constancy  constraint. 


3.2  Matching  Algorithms 

Let  us  consider  methods  which  match  features  from  two  images  to  derive  displace¬ 
ments  describing  the  optical  flow  field.  The  large  class  of  correlation-based  algorithms 
for  stereo  or  motion  are  matching  methods.  [Dev75,  Nis84,  Ana87,  LOY73.  MP76. 
KMJ77]  (see  [Hor85]  for  references).  As  a  prototype,  we  will  use  the  parallel  optical 
flow  algorithm  described  in  [LBP88].  This  algorithm  assumes  that  the  optical  flow 
is  locally  constant,  that,  for  each  point,  the  displacement  of  nearby  points  under  the 
optical  flow  is  the  same  as  the  displacement  of  the  central  point.  We  show  later 
(Section  4)  that  this  assumption  is  true  at  many  points  in  most  optical  flow  fields, 
particularly  since  the  matching  method  only  considers  displacements  at  multiples  of 
pixels.  A  second  assumption  states  that  the  effects  of  foreshortening  are  small:  note 
that  this  assumption  is  not  true  at  occluding  boundaries  of  rotating  objects.  In  stereo, 
this  assumption  is  more  often  violated,  since  the  viewing  angle  (the  angle  between  the 
projection  ray  of  a  surface  point  and  the  surface  normal)  varies  much  more  between 
the  two  images. 

The  structure  of  the  algorithm  is  as  follows.  At  each  point  in  the  image,  under 
each  integer  displacement,  the  image  in  the  two  frames  are  compared  and  a  measure 
of  the  matching  between  points  is  computed,  and  summed  over  a  small  region.  This 
can  be  interpreted  as  matching  small  patches  from  the  first  image  with  small  patches 
in  the  second.  Coherence  of  the  resulting  flow  field  is  achieved  as  a  result  of  the 
fact  that  the  support  regions,  the  patches,  have  large  overlap.  The  displacement  is 
chosen  to  maximize  the  matching  measure  over  all  displacements.  There  are  two 
important  parameters  in  this  algorithm:  the  size  of  the  summation  regions  and  the 
range  of  displacements  *o  be  considered.  For  a  further  discussion  of  this  algorithm 
and  parameter  choices,  see  [LB88]. 

The  algorithm  is  simple  and  local,  but  is  more  computationally  intensive  than 
the  measurement  stage  of  most  differential  methods,  since  the  number  of  comparison 
steps  depends  on  the  displacement  range.  The  optical  flow  obtained  by  using  this 
method  on  the  sequence  of  images  previously  considered  is  shown  in  Fig.  2.  We 
will  see  from  later  examples  of  the  matching  algorithm  that  the  algorithm  is  robust 
under  small  deviations  from  the  assumption  of  local  constancy  of  the  flow.  When 
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the  flow  results  from  parallel  translation  in  the  image  plane,  and  the  magnitude  of 
the  displacement  is  integral  and  within  the  range  of  displacements  considered,  the 
matching  algorithm  should  be  exact,  and  thus  can  serve  as  a  reference  in  some  of  our 
examples.  Note  that  the  gradient  constancy  method  also  produces  correct  estimates 
under  these  circumstances.  The  matching  measure  used  in  experiments  reported  here 
is  the  squared  difference  of  the  smoothed  brightness  over  patches  from  the  first  and 
second  images.  Valid  measurements  result  at  locations-where  the  image  Hessian  does 
not  vanish.  The  constraint  of  gradient  constancy  leads  to  a  matching  criterion  for 
the  matching  method,  the  squared  difference  of  the  image  gradients,  that  can  also  be 
effectively  used  in  the  matching  algorithm. 


4  Verifying  Assumptions 

In  this  section  we  deal  with  computational  and  numerical  assumptions  made  to  re¬ 
cover  the  optical  flow.  Sequences  of  a  rotating  circle  and  of  a  planar  surface  which 
translates  toward  the  observer  are  considered.  Finally,  the  numerical  implementation 
of  derivatives  for  optical  flow  is  discussed  in  detail. 


4.1  Local  Properties  of  the  Optical  Flow 

Let  us  consider  the  simple  case  in  which  a  planar  surface  is  translating  in  space.  The 
optical  flow  v  =  (iq,  u2)  can  then  be  written  as 


-  T3  _  _ 

V  =  — Q  ■  x(X  —  X  ) 

/  7 


(9) 


where  x  =  (x1,ar2l  /)  is  the  perspective  projection  onto  the  image  plane  of  the  point 
,Y  in  a  system  of  coordinates  defined  by  the  triple  of  mutually  orthogonal  unit  vectors 
(e*i,  e2,  eb).  The  center  of  projection  lies  at  the  origin  and  the  image  plane  has  equation 
X3  =  X  ■  e3  =  /.  The  point  x*  =  fT/T3,  T3  =  T  ■  e3,  is  the  singular  point  of  the 
optical  flow,  or  the  point  where  the  optical  flow  vanishes.  The  vector  a  =  (qi.q2,q3) 
is  the  unit  normal  to  the  planar  surface  which  has  equation 


7  =  a  ■  X  (10) 

Note  that  x  is  now  considered  a  3 D  vector.  Its  third  component  is.  however,  always 
equal  to  the  focal  length  /.  We  want  to  estimate  the  spatial  variation  of  the  optical 
flow  across  the  image  plane.  Let  us  define  6vtJ ,  the  relative  spatial  variation  of  v,  in 
the  direction  e,,  as  follows: 

I  ?  Vt 

Sv'’=M  ,1,) 
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where  |j*-  is  the  spatial  partial  derivative  of  t i  =  1,2,  in  the  direction  e,,  j  —  1,2. 
and  ||ul|  ’is  the  magnitude  of  the  optical  flow  at  x.  The  relative  spatial  variations 
are  good  measures  of  the  magnitude  of  spatial  variation  of  the  optical  flow,  because 
they  do  not  depend  on  the  magnitude  of  the  flow  field.  Previous  examinations  of  the 
uniformity  of  optical  flow  [KTB87]  formulate  a  similar  measure  of  flow  variation  and 
derive  an  approximate  bound  for  the  flow  induced  by  a  planar  surface,  although  under 
more  restricted  assumptions  on  viewing  geometry. 

Now,  straightforward  computation  of  the  partial  derivatives  of  the  u,  leads  to  the 
following  inequality 


dv{ 


dij 


<  j^(\a-x\  +  \\x-  xs| 


while  for  the  magnitude  of  the  optical  flow  we  have 


v  = 


la  •  x|  •  || x  —  x4| 

fh\ 


Thus,  since  ||x||  >  /,  one  obtains 


1 


Sv‘>  -  lisJaMi  + 


1 


(12) 


(13) 


(14) 


f  cos  0 

where  6  is  tl  ngle  between  the  normal  vector  a  and  x.  The  focal  length  /  is  typically 
103  pixels.  Inerefore,  when  the  angle  is  smaller  than  20  degrees,  at  locations  thirty 
pixels  away  from  the  singular  point  the  relative  spatial  variations  are  less  than  5% 
percent  per  pixel.  For  angles  as  wide  as  85  degrees,  at  least  ten  pixels  away  from  the 
singular  point  the  relative  spatial  variation  is  still  less  than  about  10%  per  pixel. 

A  similar  argument  for  pure  rotation  leads  to  the  following  inequality 


6vi}  <  - = (15) 

|u;3|||x  —  x4||Vl  —  2  cos2  0 

where  x4  =  fu/u 3,  cu3  =  u>  ■  e3  is  the  singular  point  of  the  optical  flow  and  0  is  the 
angle  between  x  —  x3  and  x.  In  the  case  in  which  u>  =  u >e3  (when  the  rotational  and 
optical  axis  coincide),  for  example,  Eq.  15  can  be  rewritten  as 


8vi,  < 


~  —  x3 1 


||x 


(16) 


which  is  similar  to  Eq.  14.  For  «3  =  0  or  |o?3|  <<  ||u>||,  a  similar  conclusion  can  be 
obtained. 

The  above  analysis  can  easily  be  extended  to  an  arbitrary  smooth  surface,  due 
to  its  local  nature.  The  only  difference  is  that,  since  the  normal  to  the  surface  is 
changing,  a  must  change  accordingly.  As  intuitively  expected,  therefore,  locations  of 
strong  spatial  variation  lie  near  occluding  boundaries  (that  is,  where  the  normal  to 
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the  surface  is  almost  perpendicular  to  the  projecting  ray)  and  probably  not  far  away 
from  discontinuities.  Horn  and  Schunck[HS81]  show  that  (in  the  case  of  rotational 
motion) 

V2yt  =  u2 V2x3,  V2u2  =  u>iV2x3,  (17) 

showing  that  “the  smoothness  of  the  optical  flow  is  directly  related  to  the  smoothness 
of  a  rotating  body”. 

The  above  arguments  suggest  that  the  local  shape  of  the  optical  flow  is  indeed 
fairly  simple.  Due  to  discretization  errors  and  noise,  at  most  locations,  optical  flows 
are  likely  to  be  locally  indistinguishable  from  constant  vector  fields.  Therefore,  as¬ 
sumptions  like  local  constancy,  i.e.,  slow  variation  of  the  optical  flow  across  the  image 
plane,  capture  a  fundamental  local  property  of  optical  flow,  entirely  due  to  the  simple 
structure  of  3 D  rigid  motion  which  has  been,  as  usual,  implicitly  assumed. 

4.2  Non-constant  Velocity  Fields 

We  consider  the  behavior  of  the  algorithms  when  the  assumptions  of  small  spatial 
variation  of  the  velocity  are  violated,  for  example,  in  the  cases  of  rotational  and 
looming  fields. 

4.2.1  Rotational  Field 

First  we  examine  a  sequence  of  images  of  a  circle  of  random  grey  levels  rotating  on 
the  image  plane  around  its  center  at  a  fixed  angular  velocity  of  3  degrees  per  frame, 
resulting  in  maximum  displacement  of  6.7  pixels.  A  random  grey  level  circle  rotates 
before  a  random  grey  level  background  parallel  to  the  image  plane.  The  radius  of 
the  circle  is  128  pixels,  random  grey  levels  are  uniformly  distributed  in  the  range 
0..255,  and  the  dots  are  2  by  2  pixels,  finally  smoothed  by  a  Gaussian,  a  =  2.  The 
output  of  the  Horn-Schunck  algorithm  is  shown  in  Fig.  5;  large  values  of  A,  the  regu¬ 
larization  parameter,  select  smoother  optical  flows.  The  results  of  the  local  constraint 
algorithm,  and  the  gradient  constancy  algorithm  (Eq.  4)  are  shown  respectively  in 
Fig.  6  and  Fig.  7.  The  result  of  the  matching  algorithm  is  shown  in  Fig.  8.  Both 
the  local  constraint  algorithm  and  the  gradient  constancy  algorithm  are  followed  by 
a  smoothing  step  to  restore  coherence.  Convolving  each  of  the  two  components  of 
the  optical  flow  resulting  from  the  measurement  step  with  a  2 D  symmetric  Gaussian 
function  produces  more  coherent  optical  flows.  Note  that  the  theoretical  argument  of 
the  previous  section,  that  small  relative  variation  in  the  optical  flow  can  be  assumed, 
is  clearly  verified. 

We  have  also  compared  the  optical  flow  fields  generated  by  the  various  algorithms 
with  the  true  projected  velocity  field,  which  is  available  since  these  are  synthetic 
images.  Differential  algorithms  produce  real-valued  flow  fields,  while  the  matching 


11 


Figure  4:  A  random  grey  level  disc  rotates  on  a  random  grey  level  background  parallel 
to  the  image  plane.  The  disc  radius  is  128  pixels;  grey  levels  are  uniformly  distributed 
from  0..255;  the  dots  are  2  by  2  pixels,  smoothed  by  a  Gaussian,  a  =  2. 
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Figure  5:  The  optical  flow  field  (rotating  circle)  produced  by  the  Horn-Schunck  algo¬ 
rithm,  as  in  [Hor85],  after  400  iterations  of  a  numerical  algorithm  with  A  =  1. 
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Figure  6:  The  optical  flow  field  (rotating  circle)  generated  by  the  local  constraint 
algorithm,  followed  by  Gaussian  smoothing  (on  each  component),  with  cr  =  3. 


Figure  7:  The  optical  flow  field  (rotating  circle)  generated  by  the  gradient  constancy 
algorithm,  followed  by  Gaussian  smoothing  (on  each  component),  with  a  =  3. 
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Figure  8:  The  optical  flow  field  (rotating  circle)  produced  by  the  matching  algorithm 
using  a  displacement  range  of  ±8  pixels. 
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Algorithm 

Aver,  cosct 

Aver.  \W\ 

bESEEi 

Horn-Schunck  (100  steps) 

0.976 

0.904 

0.202 

Horn-Schunck  (400  steps) 

0.977 

0.914 

0.205 

Local  constraint 

0.992 

0.645 

0.157 

Gradient  constancy 

0.991 

0.744 

0.165 

Matching  (rounded) 

0.994 

0.196 

0.052 

Matching  (exact) 

0.992 

0.422 

0.129 

Matching,  interpolated  (exact) 

0.994 

0.252 

0.082 

Table  1:  Comparison  of  true  and  computed  velocity  fields  -  rotating  image. 

algorithm  generates  integer  displacements.  For  the  differential  algorithms,  the  exact 
field  V  is  compared  with  the  computed  outputs  Vc.  The  output  of  the  matching 
algorithm  is  compared  both  with  the  exact  field,  and  the  field  rounded  to  integer 
values.  Moreover,  the  output  of  the  matching  algorithm,  the  matching  scores  at  inte¬ 
gral  displacements,  was  interpolated  to  half  pixel  increments,  resulting  in  significant 
improvement  in  the  error  measures.  Interpolated  data  for  matching  reduces  the  errors 
both  in  this  example  and  the  next  by  35%. 

Several  measures  were  computed  from  this  comparison:  the  average  of  cosa ,  the 
cosine  of  the  angle  between  the  true  and  computed  velocity  vectors  ,  the  average 
length  of  W  =  V  —  Vc,  the  vector  difference  between  the  true  and  computed  velocities 
and  the  average  of  Note  the  slow  improvement  of  the  Horn-Schunck  algorithm; 
this  flow  field  and  the  looming  field  are  both  quite  smooth,  but  the  regularization 
stage  needs  many  iterations. 

An  interesting  question  is  why  the  performance  of  differential  techniques  degrades 
when  moving  from  translational  to  rotational  velocity  fields  while  matching  tech¬ 
niques  still  produce  good  results  when  both  are  based  on  similar  constraints.  The 
answer  is  likely  to  be  intimately  connected  to  the  nature  of  differential  and  match¬ 
ing  techniques.  Differential  techniques  like  local  constraint  and  gradient  constancy 
need  some  smoothing  of  the  input  data.  During  the  smoothing  step,  brightness  in¬ 
formation  from  locations  with  different  velocities  are  mixed,  introducing  error  into 
the  differential  measurements.  The  matching  algorithm  does  not,  in  principle,  need 
smoothing,  since  it  does  not  take  derivatives.  Another  factor  in  this  and  the  next 
example  is  that  a  wide  range  of  velocities  are  present.  Differential  algorithms  depend 
on  the  brightness  derivatives  being  well  approximated  by  linear  (first  derivative)  and 
quadratic  terms  (second  derivative).  The  images  are  smoothed  by  a  Gaussian  before 
differentiation,  but  no  one  scale  of  Gaussian  suffices  to  smooth  brightness  enough  at 
high  velocities  while  not  over- blurring  locations  with  small  velocities. 
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Algorithm 

Aver,  cosa 

Aver.  |W| 

mm 

Horn-Schunck  (100  steps) 

0.942 

0.463 

0.321 

Horn-Schunck  (400  steps) 

0.943 

0.450 

0.314 

Local  constraint 

0.960 

0.316 

0.174 

Gradient  constancy 

0.977 

0.230 

0.105 

Matching  (rounded) 

0.988 

0.115 

0.062 

Matching  (exact) 

0.980 

0.405 

0.247 

Matching,  interpolated  (exact) 

0.992 

0.211 

0.160 

Table  2:  Comparison  of  true  and  computed  velocity  fields  -  looming  image. 

4.2.2  Looming  Field 

Let  us  now  consider  another  simple  kind  of  motion,  that  of  a  planar  surface  translating 
in  space  towards  the  observer  (looming).  We  have  compared  the  performance  of  the 
algorithms  in  the  case  of  a  looming  planar  surface;  its  appearance  is  similar  to  the 
textured  surface  used  in  the  preceding  example. 

The  Horn-Schunck  algorithm  behaves  well  (see  Fig.  9).  Differential  techniques 
based  on  larger  support,  like  local  constraint  and,  in  particular,  the  gradient  constancy 
technique,  are  not  influenced  by  the  spatial  structure  of  the  gradient  (see  Fig.  10  and 
11  respectively).  The  matching  algorithm  appears  not  to  produce  any  output  in 
the  immediate  neighborhood  of  the  singular  point  of  the  optical  flow  —  a  focus  of 
expansion  —  because  displacements  in  the  neighborhood  of  the  focus  of  expansion 
are  less  than  one  pixel  (see  Fig.  12),  but  does  correctly  compute  zero  motion  (the 
rounded  approximation  to  small  values)  near  the  singular  point. 

4.3  Numerical  Differentiation 

In  implementing  and  testing  these  algorithms,  we  directly  confronted  many  of  the 
difficulties  in  computing  derivatives  of  images.  As  a  concluding  point  of  our  analysis, 
we  examine  briefly  the  problems  underlying  numerical  implementation  of  derivatives 
of  image  brightness.  Typically,  computer  vision  algorithms  —  with  rare  exceptions,  for 
example,  [TP86]  —  use  two-point  and  three- point  approximation  formulae  to  compute 
first  and  second  derivatives  of  image  brightness.  For  example,  an  analysis  of  errors  in 
optical  flow  gradient  methods  [KTB87]  uses  only  the  forward  difference.  In  essence, 
it  is  taken  for  granted  that  the  inter-pixel  distance  in  space  (or  average  displacements 
over  the  image  plane  between  consecutive  frames  in  time)  is  very  ‘‘small”.  A  closer 
analysis,  however,  shows  that  this  is  equivalent  to  assuming  that  the  filter  function 
used  before  taking  derivatives  is  oversampled  in  space  (or  time),  i.e.,  that  it  has  a 
“large”  scale.  To  study  the  effects  of  various  finite  difference  methods,  we  use  the  fact 
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Figure  9:  The  optical  flow  (looming  image)  produced  after  400  iterations  by  the 
Horn-Schunck  algorithm,  as  in  [Hor85],  with  A  =  1. 


Figure  10:  Optical  flow  (looming  image)  from  the  local  constraint  algorithm.  The 
optical  flow  is  assumed  to  be  constant  in  an  11  x  11  neighborhood.  The  vector  field 
is  convolved  with  a  symmetric  Gaussian  function  of  standard  deviation  cr  =  3. 
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Figure  11:  Optical  flow  (looming  image)  from  the  gradient  constancy  algorithm.  The 
vector  field  is  convolved  with  a  symmetric  Gaussian  function  of  standard  deviation 
<7  =  3. 


Figure  12:  Optical  flow  (looming  image)  produced  by  the  matching 
a  displacement  range  of  ±8  pixels. 


algorithm  using 
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that  differentiation  commutes  with  convolution  and  examine  the  approximations  to 
the  first  and  second  derivatives  of  a  ID  Gaussian  generated  by  convolving  the  Gaussian 
with  several  finite  difference  operators.  Figures  13-16  show  some  characteristic  graphs. 
The  curves  plot  the  true  derivative  of  a  ID  Gaussian  G,  and  samples  of  several 
numerical  approximations:  the  forward  or  finite  difference  (two-point  formula), 


/'(*)  =  /(*  +  !)-/(*), 


(18) 


central  or  symmetric  difference  (three-point  formula), 


/'(x)==(/(x  +  l)-/(x-l))/2,  (19) 

four- point  formula, 

f\x)  =  (/(x  -  1  )-/(*  +  2))/24  +  27(/(x  +  1)  -  /(*))/ 24,  (20) 

and  five-point  formula, 

f\x)  =  2(/(x  4- 1)  -  /(x  -  l))/3  -  (/(x  +  2)  -  f(x  -  2))/12,  (21) 


at  the  pixel  x.  In  the  formulae  above,  the  interpixel  distance  is  unity. 

The  comparison  between  the  second  derivative,  /",  of  a  ID  Gaussian  function  / 
and  numerical  approximation  of  it  is  obtained  by  using  symmetric  difference, 


f"(x)  =  f(x  +  1)  -  2/(x)  +  f(x  -  1),  (22) 


and  five-point  support, 

/"(x)  =  4(/(x  +  1)  -  2/(x)  +  /(x  -  l))/3  -  (/(x  +  2)  -  2/(x)  +  /(*  -  2))/12,  (23) 
at  the  pixel  x. 

The  reference  curve  in  each  of  the  figures  is  plotted  with  open  circles.  For  small 
values  of  cr,  the  central  difference  for  the  first  derivative  (squares  in  Fig.  13)  and 
three-point  approximation  for  the  second  derivative  (triangles  in  Fig.  15)  are  clearly 
insufficient.  The  error  for  the  approximation  A  is  computed  at  integer  values  i  for 
i  <  3<7  and  the  quantity  shown  is: 

Si*1  W)  -  g(Q| 

|G'(i)l 
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The  following  table  summarizes  the  errors  for  various  approximations.  Note  that 


ism 

2-point 

3- point 

4- point 

5-point 

6-point 

7-point 

1.0 

0.0737 

0.258 

0.0153 

0.0875 

0.00647 

0.0495 

1.5 

0.0333 

0.126 

0.00523 

0.0339 

0.00171 

0.0135 

2.0 

0.0192 

0.0745 

0.00193 

0.0126 

0.000380 

0.00341 

3.0 

0.00845 

0.0334 

0.000392 

0.00271 

0.0000376 

0.000364 

4.0 

0.00474 

0.0188 

0.000129 

0.000899 

0.00000696 

0.0000689 

Table  3:  Normalized  summed  absolute  errors  for  first  derivative  of  Gaussian. 

the  error  for  the  central  difference  only  becomes  less  than  5%  for  a  >=  3.  This 
example  demonstrates  that  these  estimators  are  accurate,  as  theory  predicts,  when 
the  higher-order  derivatives  of  the  function  being  approximated  are  small.  For  the  first 
derivative  of  the  Gaussian,  these  derivatives  become  small  when  a  >  3  or  x  becomes 
large.  But,  when  these  conditions  are  not  met,  it  is  clearly  superior  to  use  larger 
support  for  the  approximation.  Interestingly,  the  estimators  with  an  even  number  of 
points  are  consistently  better  than  the  odd  numbered  estimators.  For  example,  at 
<7  =  1,  it  is  necessary  to  use  the  7-point  estimate  to  achieve  accuracy  comparable  with 
the  2-point  estimate. 

But  there  is  a  problem  with  these  even  number  operators.  For  the  first  derivative, 
the  estimators  with  an  even  number  of  points,  the  forward  (two-point)  and  the  four- 
point  estimator,  are  shown  displaced  leftward  by  —0.5.  Both  actually  estimate  the 
first  derivative  at  the  point  x  =  0.5.  For  large  cr,  the  relative  shift  is  small.  Three- 
and  five-point  formulae  for  the  finite  difference  estimate  the  derivative  at  x  =  0. 
Any  computation  which  mixes  the  forward  difference  operator  for  the  first  derivative 
(which  is  shifted)  with  the  three-point  operator  for  the  second  derivative  (unshifted) 
runs  into  difficulties.  The  bias  of  the  forward  difference  operator  shows  up  as  a  phase 
shift,  dependent  on  the  image  frequency,  when  the  frequency  response  characteristics 
of  the  operator  are  examined. 

Let  us  examine  the  z-transform  of  the  forward-difference  operator  z~l  —  1.  The 
Fourier  transform  of  this  operator  is 


;-l-l|U  =  e-^-l, 

(25) 

which,  factored,  is 

e-;w/2(e-;w/2  _  ejw/ 2^ 

(26) 

which  reduces  to 

cj(»/2— w/2  )2sinuJ/2. 

(27) 

For  small  angles,  sinus  approximates  w,  giving 

cj(»/2— «/2)w> 

(28) 
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The  Fourier  transform  of  the  derivative  operator  is 

e:*nu.  (29) 

Comparing  these  two  formula  we  see  that  the  forward-difference  operator  is  an  ad¬ 
equate  approximation  for  the  derivative  operator  only  for  small  u,  but.  even  there, 
there  is  a  small  phase  shift,  proportional  to  u>/2.  The  other  operators  do  not  introduce 
such  a  phase  shift. 

These  observations  explain  the  symmetric  treatment  of  spatial  and  temporal  deriva¬ 
tives  in  [Hor85]  (p.  289).  There  the  forward  difference  operator  is  used,  both  in  space 
and  time.  Our  analysis  indicates  that  its  use  must  be  accompanied  by  significant 
smoothing  of  the  image. 

Another  consequence  of  the  preceding  analysis  of  numerical  differentiation  is  that 
one  can  improve  the  results  of  implementations  of  optical  flow  methods  simply  bv 
improving  the  accuracy  of  the  measurement  stage.  As  an  example,  consider  the  local 
constraint  method;  it  does  not  require  subsequent  smoothing,  so  the  effects  of  im¬ 
proved  measurements  should  directly  appear  in  the  results.  In  the  results  we  reported 
above,  we  used  Horn’s  symmetric  spatial/temporal  differentiators.  We  replaced  the 
x  and  y  components  of  these  operators  by  the  4-point  operators.  The  results  for 
the  rotating  image  figure  show  minimal  improvement,  but  other  similar  images  show 
substantial  improvement,  reducing  the  length  of  the  error  vector  by  up  to  25%. 

Similarly,  the  gradient  constancy  utilizes  temporal  derivatives  of  dE/dx  and  dE/dy. 
The  output  of  that  algorithm  is  most  sensitive  to  the  temporal  derivatives;  changing 
the  operator  from  the  3-point  to  the  5-point  operator  for  the  first  derivative  reduces 
error  by  30%. 

Numerical  differentiation  is,  in  general,  ill-posed,  and  the  choice  of  operator  must 
be  influenced  by  analyis  of  the  task  and  the  noise  in  the  image  [TP86].  It  is  important 
to  note  that,  while,  over  space,  larger  support  can  be  utilized  with  only  a  small 
increase  in  computation,  but  over  time  the  situation  is  rather  different,  requiring 
more  images.  Temporal  differentiation  is  more  critical  to  differential  methods  than 
spatial  differentiation.  The  difference  in  sampling  in  space  and  time  is  significant, 
and  usually  several  orders  of  magnitude.  In  estimation  of  the  temporal  derivative,  the 
phase  shift  induced  by  the  derivative  estimators  with  even  numbers  of  points  becomes 
considerable,  given  the  relatively  large  spacing  between  sample  points.  Considerable 
care  must  be  taken  to  handle  the  estimates  appropriately.  The  symmetric  operator 
used  by  Horn  averages  spatial  derivates  over  time  and  temporal  derivatives  over  space 
so  that  estimates  are  taken  at  the  same  time  in  space-time,  resulting  in  accurate 
estimates  that  do  not  suffer  from  the  phase  shift. 
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Figure  13:  First  derivative  of  Gaussian  function  (a  —  1):  Gaussian  (open  circle), 
forward  difference  (square),  central  difference  (triangle),  four-point  (diamond),  and 
five- point  (star). 


Figure  14:  First  derivative  of  Gaussian  function  {a  =  2):  Gaussian  (open  circle), 
forward  difference  (square),  central  difference  (triangle),  four-point  (diamond),  and 
five-point  (star). 
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Second  Derivative  —  Sigma  =  1 


Figure  15:  The  second  derivative  of  the  Gaussian  function  (cr  =  1):  derivative  of  the 
Gaussian  (open  circle);  central  difference  (triangle);  five-point  formula  (star). 


Second  Derivative  —  Sigma  =  2  ' 

I 


Figure  16:  The  second  derivative  of  the  Gaussian  function  (a  =  2):  derivative  of  the 
Gaussian  (open  circle);  central  difference  (triangle);  five-point  formula  (star). 
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5  Conclusion 


We  have  shown  that  local  algorithms  can  provide  good  direct  measurements  of  the 
optical  flow,  and  can  locally  solve  the  aperture  problem.  These  algorithms  require 
a  simpler,  less  computationally  demanding  smoothing  stage  than  global  algorithms 
which  begin  with  normal  motion.  Since  less  smoothing  is  used,  localization  and  dis¬ 
continuity  identification  are  likely  to  be  better.  These  improved  algorithms  inquire 
larger  spatial  support,  and  thus  need  to  make  stronger  assumptions  about  the  optical 
flow.  We  have  shown  that,  in  the  cases  of  pure  translation  and  pure  rotation,  the  pro¬ 
jected  velocity  field  which  produces  the  optical  flow  will  meet  the  local  requirements 
of  slow  variation  except  close  to  singular  points  of  the  velocity  field. 

We  have  implemented  these  algorithms  on  the  Conr^  -tion  Machine  [LBC89]  and 
tested  them  on  various  real  and  synthetic  images,  to  test  their  sensitivity  to  violations 
of  their  assumptions.  It  is  clear  from  both  analysis  and  experience  in  the  implemen¬ 
tation  that  care  must  be  taken  in  computing  derivatives.  Interestingly,  the  methods 
which  produce  improved  measurements  have  much  in  common:  they  use  larger  spatial 
support  and  they  rely  on  similar  assumptions  on  the  local  behavior  of  the  velocity 
field. 
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